Description

You’ve decided to open a small robot-run cafe in Los Angeles. The project is promising but expensive, so you and your partners decide to try to attract investors. They’re interested in the current market conditions—will you be able to maintain your success when the novelty of robot waiters wears off? You’re an analytics guru, so your partners have asked you to prepare some market research. You have open-source data on restaurants in LA.

Link to presentation: https://drive.google.com/file/d/1TVUoAm0YTSeMQf9cgDr5X0IuSHaHDvxh/view?usp=sharing

Preprocessing data

Investigate the proportions of the various types of establishments. Plot a graph.

Most of the records in our data are of the Restaurant type, while the Bakery type is the least common.
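As a rough illustration of how such proportions can be computed, here is a minimal sketch using only the Python standard library. The sample rows and the `type` key are made up for the example and are not taken from the actual dataset.

```python
from collections import Counter

# Hypothetical sample rows; the real dataset and its column names may differ.
establishments = [
    {"name": "A", "type": "Restaurant"},
    {"name": "B", "type": "Restaurant"},
    {"name": "C", "type": "Fast Food"},
    {"name": "D", "type": "Bakery"},
]

def type_proportions(rows):
    """Return {type: share of rows}, ordered from most to least common."""
    counts = Counter(row["type"] for row in rows)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.most_common()}

proportions = type_proportions(establishments)
for kind, share in proportions.items():
    print(f"{kind}: {share:.0%}")
```

The resulting dictionary can be passed to any plotting library to draw the bar or pie chart.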

Investigate the proportions of chain and nonchain establishments. Plot a graph.

62% of the establishments are non-chain, while 38% are chains.

Which type of establishment is typically a chain?

Bakeries are typically chains: Bakery is the only establishment type with no non-chain instances in the data.
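One way to check which type is typically a chain is to compute the share of chain establishments within each type. The sketch below assumes each row has a `type` and a boolean `chain` field; both names and all sample values are illustrative only.

```python
from collections import defaultdict

# Hypothetical sample rows; field names are assumptions, not the dataset's.
rows = [
    {"type": "Bakery", "chain": True},
    {"type": "Bakery", "chain": True},
    {"type": "Cafe", "chain": True},
    {"type": "Cafe", "chain": False},
    {"type": "Restaurant", "chain": False},
    {"type": "Restaurant", "chain": False},
    {"type": "Restaurant", "chain": True},
]

def chain_share_by_type(rows):
    """Return {type: fraction of establishments of that type that are chains}."""
    totals = defaultdict(int)
    chains = defaultdict(int)
    for row in rows:
        totals[row["type"]] += 1
        chains[row["type"]] += row["chain"]  # True counts as 1, False as 0
    return {kind: chains[kind] / totals[kind] for kind in totals}

shares = chain_share_by_type(rows)
# Types for which every establishment in the sample is a chain.
always_chain = [kind for kind, share in shares.items() if share == 1.0]
print(shares, always_chain)
```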

What characterizes chains: many establishments with a small number of seats or a few establishments with a lot of seats?

Establishments average 40 seats each, which suggests that chains are characterized by a few establishments with a lot of seats. We can also see that there are not many establishments per chain, while the average number of seats per chain is high, at 53.
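The comparison above can be sketched by grouping chain establishments by chain name, then looking at how many establishments each chain has and how many seats they average. The chain names, seat counts, and the `name`/`seats` keys below are invented for illustration, and the sketch assumes all branches of a chain share exactly the same name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical chain establishments; values are made up for the example.
chain_rows = [
    {"name": "ROBO BURGER", "seats": 20},
    {"name": "ROBO BURGER", "seats": 30},
    {"name": "ROBO BURGER", "seats": 10},
    {"name": "BIG TABLE", "seats": 120},
    {"name": "BIG TABLE", "seats": 80},
]

def summarize_chains(rows):
    """Return {chain name: {"establishments": count, "avg_seats": mean seats}}."""
    seats_by_chain = defaultdict(list)
    for row in rows:
        seats_by_chain[row["name"]].append(row["seats"])
    return {
        name: {"establishments": len(seats), "avg_seats": mean(seats)}
        for name, seats in seats_by_chain.items()
    }

summary = summarize_chains(chain_rows)
for name, stats in summary.items():
    print(name, stats)
```

Plotting `establishments` against `avg_seats` for every chain (for example as a scatter plot) shows which of the two patterns dominates.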

Determine the average number of seats for each type of restaurant. On average, which type of restaurant has the greatest number of seats? Plot graphs.

Restaurants have the greatest average number of seats.

Put the data on street names from the address column in a separate column.
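A minimal sketch of one possible way to pull a street name out of an address string is shown below. The helper `extract_street`, its regular expressions, and the sample addresses are assumptions made for illustration; the cleaning rules used in the actual analysis may differ, and real addresses will need more cases than these.

```python
import re

def extract_street(address):
    """Rough street-name extraction from a US-style address string."""
    street = address.strip().upper()
    # Drop a leading house number such as "3708" or "3708A".
    street = re.sub(r"^\d+\S*\s+", "", street)
    # Drop a trailing unit marker such as "# 253", "#253" or "STE 2".
    street = re.sub(r"\s+(?:#\s*|(?:STE|UNIT|SUITE)\s+)\S+$", "", street)
    # Drop a single-letter compass prefix such as "N" or "W".
    street = re.sub(r"^[NSEW]\s+", "", street)
    return street

# Hypothetical addresses used only to exercise the helper.
for address in ["3708 N EAGLE ROCK BLVD", "6801 HOLLYWOOD BLVD # 253"]:
    print(address, "->", extract_street(address))
```

Stripping the compass prefix merges, for example, the east and west parts of a street into one name; whether that is desirable depends on how the streets should be compared.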

Plot a graph of the top ten streets by number of restaurants.

Sunset BLVD has the highest number of restaurants at close to 400, followed by Wilshire BLVD and Pico BLVD, with 397 and 370 restaurants respectively. Number 10 on the list is Hollywood BLVD with 253 restaurants.

Find the number of streets that only have one restaurant.

There are 250 streets with only one restaurant.
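Both the top-ten ranking and the count of single-restaurant streets can be derived from one tally of street names. The following sketch uses a small made-up list of street names in place of the real street column.

```python
from collections import Counter

# Hypothetical street column, one entry per restaurant.
streets = [
    "SUNSET BLVD", "SUNSET BLVD", "SUNSET BLVD",
    "PICO BLVD", "PICO BLVD",
    "ELM ST",
    "OAK AVE",
]

street_counts = Counter(streets)

# Up to ten streets with the most restaurants, as (street, count) pairs.
top_streets = street_counts.most_common(10)

# Streets that host exactly one restaurant.
single_streets = sorted(s for s, n in street_counts.items() if n == 1)

print(top_streets)
print(len(single_streets), "streets have only one restaurant")
```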

The plot above shows a clear positive correlation between the number of restaurants on a street and the number of seats: the more restaurants a street has, the more seats it has.
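The strength of such a relationship can be quantified with the Pearson correlation coefficient. The sketch below implements it by hand and applies it to invented per-street figures; treating "number of seats" as the total number of seats on each street is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-street figures (one entry per street).
restaurants_per_street = [1, 5, 20, 100, 400]
total_seats_per_street = [30, 240, 900, 4100, 17000]

r = pearson(restaurants_per_street, total_seats_per_street)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.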

We can also see the names of the streets whose number of seats is an outlier in our data.
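Outliers of this kind are commonly flagged with the usual boxplot rule: a value is an outlier if it lies more than 1.5 times the interquartile range above the third quartile or below the first. The sketch below applies that rule to made-up per-street seat totals; the street names, the figures, and the choice of the 1.5 multiplier are assumptions for illustration.

```python
from statistics import quantiles

# Hypothetical seat totals per street.
seats_by_street = {
    "A ST": 40,
    "B AVE": 55,
    "C BLVD": 60,
    "D RD": 70,
    "E WAY": 80,
    "SUNSET BLVD": 900,
}

# Quartiles of the seat totals (inclusive method treats the data as the full population).
q1, _median, q3 = quantiles(seats_by_street.values(), n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Streets whose seat total falls outside the fences.
outliers = sorted(
    street
    for street, seats in seats_by_street.items()
    if seats < lower_fence or seats > upper_fence
)
print(outliers)
```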

In conclusion, I inspected the data and found 3 NaN values in the chain field, which I removed as they were irrelevant to my task. There were no duplicates.

I found that 75% of the data consists of restaurants and 11% of fast food establishments. In addition, about 62% of the establishments are not chains.

I found that bakeries appear in the data only as chains. In addition, I saw that chains tend to have few establishments with many seats, with restaurants having the highest average number of seats.

Lastly, I inspected the top 10 streets by establishment count, found that 250 streets have only one restaurant, and found a positive correlation between the number of establishments on a street and the number of seats.

Cafe establishments have 25 seats on average and consist of about 50% chains.

Based on the conclusions above, I recommend opening an average-sized cafe (25 seats) in a location that is not packed with restaurants but that also has more than one. I recommend looking at streets that appear in the top 10 but do not appear as outliers above the third quartile on the boxplot above, as that may indicate overcrowding.